Abstract. Suppose that a target function $f_0 : \mathbb{R}^d \to \mathbb{R}$ is monotonic, namely weakly
increasing, and that an original estimate $\hat f$ of the target function is available, which is not
weakly increasing. Many common estimation methods used in statistics produce such
estimates $\hat f$. We show that these estimates can always be improved with no harm using
rearrangement techniques: the rearrangement methods, univariate and multivariate,
transform the original estimate into a monotonic estimate $\hat f^*$, and the resulting estimate
is closer to the true curve $f_0$ in common metrics than the original estimate $\hat f$. We
illustrate the results with a computational example and an empirical example dealing
with age-height growth charts.
1.  Introduction

A  common  problem  in  statistics  is  to  approximate  an  unknown  monotonic  function 
on  the  basis  of  available  samples.  For  example,  biometric  age-height  charts  should  be 
monotonic  in  age;  econometric  demand  functions  should  be  monotonic  in  price;  and 
quantile  functions  should  be  monotonic  in  the  probability  index.  Suppose  an  original, 
possibly  non-monotonic,  estimate  is  available.  Then,  the  rearrangement  operation  from 
variational  analysis  (Hardy,  Littlewood,  and  Polya  1952,  Lorentz  1953,  Villani  2003) 
can  be  used  to  monotonize  the  original  estimate.  The  rearrangement  has  been  shown 
to  be  useful  in  producing  monotonized  estimates  of  conditional  mean  functions  (Dette, 
Neumeyer,  and  Pilz  2006,  Dette  and  Pilz  2006)  and  various  conditional  quantile  and 
probability functions (Chernozhukov, Fernandez-Val, and Galichon 2006a, 2006b). In
this  paper,  it  is  shown  that  the  rearrangement  of  the  original  estimate  is  useful  not 
only  for  producing  monotonicity,  but  also  has  the  following  important  property:  The 
rearrangement  always  improves  over  the  original  estimate,  whenever  the  latter  is  not 
monotonic.  Namely,  the  rearranged  curves  are  always  closer  (often  considerably  closer) 
to  the  target  curve  being  estimated.  Furthermore,  this  improvement  property  is  generic, 
i.e.  it  does  not  depend  on  the  underlying  specifics  of  the  original  estimate  and  applies 
to  both  univariate  and  multivariate  cases. 

The  paper  is  organized  as  follows.  In  Section  2.1,  we  motivate  the  monotonicity 
issue in regression problems, and discuss common estimates/approximations of regression
functions that are not naturally monotonic. In Section 2.2, we analyze the improvements
in estimation/approximation properties that the rearranged estimates deliver. In Section
2.3,  we  discuss  the  computation  of  the  rearrangement,  using  sorting  and  simulation.  In 
Section  2.4,  we  extend  the  analysis  of  Section  2.2  to  multivariate  functions.  In  Section  3, 
we  provide  proofs  of  the  main  results.  In  Section  4,  we  present  an  empirical  application  to 
biometric  age-height  charts.  We  show  how  the  rearrangement  monotonizes  and  improves 
the  original  estimates  of  the  conditional  mean  function  in  this  example,  and  quantify 
the  improvement  in  a  simulation  example  resembling  the  empirical  one.  In  the  same 
section,  we  also  analyze  estimation  of  conditional  quantile  processes  for  height  given  age 
that  need  to  be  monotonic  in  both  age  and  the  quantile  index.  We  apply  a  multivariate 
rearrangement  to  doubly  monotonize  the  estimates  both  in  age  and  the  quantile  index. 
We show that the rearrangement monotonizes and improves the original estimates, and
quantify the improvement in a simulation example mimicking the empirical example. In
Section 5 we offer a summary and a conclusion.

2.  Improving Approximations of Monotonic Functions

2.1. Common Estimates of Monotonic Functions. A basic problem in many areas
of analysis is to approximate an unknown function $f_0 : \mathbb{R}^d \to \mathbb{R}$ on the basis of
some available information. In statistics, the common problem is to approximate an
unknown regression function, such as the conditional mean or a conditional quantile,
using an available sample. In numerical analysis, the common problem is to approximate
an intractable target function by a more tractable function on the basis of the target
function's values at a collection of points.

Suppose we know that the target function $f_0$ is monotonic, namely weakly increasing.
Suppose further that an original estimate $\hat f$ is available, which is not necessarily
monotonic. Many common estimation methods do indeed produce such estimates. Can these
estimates always be improved with no harm? The answer provided by this paper is yes:
the rearrangement method transforms the original estimate into a monotonic estimate
$\hat f^*$, and this estimate is in fact closer to the true curve $f_0$ than the original estimate $\hat f$
in common metrics. Furthermore, the rearrangement is computationally tractable, and
thus preserves the computational appeal of the original estimates.

Estimation methods, specifically the ones used in regression analysis, can be grouped
into global methods and local methods. An example of a global method is the series
estimator of $f_0$ taking the form

$$\hat f(x) = P_{k_n}(x)'\hat b,$$

where $P_{k_n}(x)$ is a $k_n$-vector of suitable transformations of the variable $x$, such as
B-splines, polynomials, and trigonometric functions. Section 4 lists specific examples in
the context of an empirical example. The estimate $\hat b$ is obtained by solving the regression
problem

$$\hat b = \arg\min_{b \in \mathbb{R}^{k_n}} \sum_{i=1}^{n} \rho\big(Y_i - P_{k_n}(X_i)'b\big),$$

where $(Y_i, X_i)$, $i = 1, \ldots, n$, denotes the data. In particular, using the square loss $\rho(w) = w^2$
produces estimates of the conditional mean of $Y_i$ given $X_i$ (Gallant 1981, Andrews
1991, Stone 1994, Newey 1997), while using the asymmetric absolute deviation loss
$\rho(w) = (u - 1(w < 0))w$ produces estimates of the conditional $u$-quantile of $Y_i$ given $X_i$
(Koenker and Bassett 1978, Portnoy 1997, He and Shao 2000). Likewise, in numerical
analysis "data" often consist of values $Y_i$ of a target function evaluated at a collection
of mesh points $\{X_i, i = 1, \ldots, n\}$ and the mesh points themselves. The series estimates
$x \mapsto \hat f(x) = P_{k_n}(x)'\hat b$ are widely used in data analysis due to their good approximation
properties and computational tractability. However, these estimates need not be naturally
monotone, unless explicit constraints are added to the optimization program (for
example, Matzkin (1994), Silvapulle and Sen (2005), and Koenker and Ng (2005)).
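To make the role of the asymmetric absolute deviation loss concrete, the following minimal sketch fits a constant only (no regressors), which is a simplification assumed here for brevity. It illustrates that minimizing this loss returns a sample $u$-quantile. The function names and the toy data are hypothetical and not taken from the paper.

```python
def check_loss(w, u):
    """Asymmetric absolute deviation loss rho(w) = (u - 1{w < 0}) * w."""
    return (u - (1.0 if w < 0 else 0.0)) * w


def constant_fit(ys, loss):
    """Minimize sum_i loss(y_i - b) over candidate constants b taken from the data.

    Restricting b to the observed values is enough for the check loss, whose
    minimizer can always be chosen among the data points.
    """
    return min(ys, key=lambda b: sum(loss(y - b) for y in ys))


values = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy data, not from the paper
median_fit = constant_fit(values, lambda w: check_loss(w, 0.5))
upper_fit = constant_fit(values, lambda w: check_loss(w, 0.9))
```

With $u = 0.5$ the fit is the sample median, and with $u = 0.9$ it moves to the upper end of the sample. A series quantile estimator replaces the constant by $P_{k_n}(x)'b$.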

Examples of local methods include kernel and locally polynomial estimators. A kernel
estimator takes the form

$$\hat f(x) = \arg\min_{b \in \mathbb{R}} \sum_{i=1}^{n} w_i\,\rho(Y_i - b), \qquad w_i = K\!\left(\frac{X_i - x}{h}\right),$$

where the loss function $\rho$ plays the same role as above, $K(u)$ is a standard, possibly
high-order, kernel function, and $h > 0$ is a vector of bandwidths (see, for example,
Wand and Jones (1995) and Ramsay and Silverman (2005)). The resulting estimate
$x \mapsto \hat f(x)$ need not be naturally monotone. Dette, Neumeyer, and Pilz (2006) show
that the rearrangement transforms the kernel estimate into a monotonic one. We further
show here that the rearranged estimate necessarily improves upon the original estimate,
whenever the latter is not monotonic. The locally polynomial regression is a related
local method (Chaudhuri 1991, Fan and Gijbels 1996). In particular, the locally linear
estimator takes the form

$$(\hat f(x), \hat d(x)) = \arg\min_{b \in \mathbb{R},\, d \in \mathbb{R}} \sum_{i=1}^{n} w_i\,\rho\big(Y_i - b - d(X_i - x)\big), \qquad w_i = K\!\left(\frac{X_i - x}{h}\right).$$

The resulting estimate $x \mapsto \hat f(x)$ may also be non-monotonic, unless explicit constraints
are added to the optimization problem. Section 4 illustrates the non-monotonicity of
the locally linear estimate in an empirical example.
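For the square loss, the kernel estimator above reduces to a weighted mean of the responses. The sketch below assumes a Gaussian kernel and a scalar bandwidth, both chosen here for illustration, and uses made-up data with a small dip to show how such a fit can fail to be monotone.

```python
import math


def gaussian_kernel(t):
    """Standard Gaussian kernel K(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def kernel_mean(x, xs, ys, h):
    """Kernel estimate at x under square loss: the weighted mean of ys
    with weights w_i = K((X_i - x) / h)."""
    weights = [gaussian_kernel((xi - x) / h) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Toy data (hypothetical): heights that dip slightly at one age.
ages = [2.0, 3.0, 4.0, 5.0, 6.0]
heights = [90.0, 97.0, 96.0, 104.0, 110.0]
fit = [kernel_mean(a, ages, heights, h=0.3) for a in ages]
is_monotone = all(b >= a for a, b in zip(fit, fit[1:]))
```

With this small bandwidth the fit follows the dip in the data, so `is_monotone` is false even though the quantity being estimated would be expected to increase with age.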

In summary, there are many attractive estimation and approximation methods in
statistics that do not necessarily produce monotonic estimates. These estimates do have
other attractive features though, such as good approximation properties and computational
tractability. Below we show that the rearrangement operation applied to these
estimates produces (monotonic) estimates that improve the approximation properties of
the original estimates by bringing them closer to the target curve. Furthermore, the
rearrangement is computationally tractable, and thus preserves the computational appeal
of the original estimates.


2.2. The Rearrangement and its Approximation Property: The Univariate
Case. In what follows, let $\mathcal{X}$ be a compact interval. Without loss of generality, it is
convenient to take this interval to be $\mathcal{X} = [0,1]$. Let $f(x)$ be a measurable function
mapping $\mathcal{X}$ to $K$, a bounded subset of $\mathbb{R}$. Let $F_f(y) = \int_{\mathcal{X}} 1\{f(u) \le y\}\,du$ denote the
distribution function of $f(X)$ when $X$ follows the uniform distribution on $[0,1]$. Let

$$f^*(x) := Q_f(x) := \inf\{y \in \mathbb{R} : F_f(y) \ge x\}$$

be the quantile function of $F_f(y)$. Thus,

$$f^*(x) := \inf\left\{y \in \mathbb{R} : \left[\int_{\mathcal{X}} 1\{f(u) \le y\}\,du\right] \ge x\right\}.$$

This function $f^*$ is called the increasing rearrangement of the function $f$.

Thus, the rearrangement operator simply transforms a function $f$ to its quantile
function $f^*$. That is, $x \mapsto f^*(x)$ is the quantile function of the random variable $f(X)$ when
$X \sim U(0,1)$. It is also convenient to think of the rearrangement as a sorting operation:
given values of the function $f(x)$ evaluated at $x$ in a fine enough net of equidistant
points, we simply sort the values in increasing order. The function created in this
way is the rearrangement of $f$.
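The sorting description can be sketched in a few lines. The example below assumes a midpoint grid on $[0,1]$ and a made-up non-monotone function; both are illustrative choices and not part of the paper.

```python
import math


def rearrange_on_grid(f, n_points):
    """Approximate the increasing rearrangement f* of f on [0, 1].

    Evaluates f on the equidistant midpoints (j - 0.5) / n_points and sorts
    the values; the j-th sorted value approximates f*((j - 0.5) / n_points).
    """
    grid = [(j - 0.5) / n_points for j in range(1, n_points + 1)]
    return grid, sorted(f(x) for x in grid)


def wiggly(x):
    """A hypothetical non-monotone estimate: an increasing trend plus a wave."""
    return x + 0.25 * math.sin(8.0 * math.pi * x)


grid, f_star = rearrange_on_grid(wiggly, 200)
original = [wiggly(x) for x in grid]
```

The sorted values `f_star` contain exactly the same numbers as `original`, only reordered, which mirrors the fact that $f$ and $f^*$ share the same distribution function.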

The  main  point  of  this  paper  is  the  following: 

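Proposition 1. Let $f_0 : \mathcal{X} \to K$ be a weakly increasing measurable function in $x$, where
$K$ is a bounded subset of $\mathbb{R}$. This is the target function. Let $\hat f : \mathcal{X} \to K$ be another
measurable function, an initial estimate of the target function $f_0$.

1. For any $p \in [1,\infty]$, the rearrangement of $\hat f$, denoted $\hat f^*$, weakly reduces the estimation
error:

$$\left[\int_{\mathcal{X}} \big|\hat f^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \left[\int_{\mathcal{X}} \big|\hat f(x) - f_0(x)\big|^p\,dx\right]^{1/p}. \tag{2.1}$$
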


2. Suppose that there exist regions $\mathcal{X}_0$ and $\mathcal{X}_0'$, each of measure greater than $\delta > 0$, such
that for all $x \in \mathcal{X}_0$ and $x' \in \mathcal{X}_0'$ we have that (i) $x' > x$, (ii) $\hat f(x) > \hat f(x') + \epsilon$, and (iii)
$f_0(x') > f_0(x) + \epsilon$, for some $\epsilon > 0$. Then the gain in the quality of approximation is
strict for $p \in (1,\infty)$. Namely, for any $p \in [1,\infty]$,

$$\left[\int_{\mathcal{X}} \big|\hat f^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \left[\int_{\mathcal{X}} \big|\hat f(x) - f_0(x)\big|^p\,dx - \delta\eta_p\right]^{1/p}, \tag{2.2}$$

where $\eta_p = \inf\{|v - t'|^p + |v' - t|^p - |v - t|^p - |v' - t'|^p\}$ and $\eta_p > 0$ for $p \in (1,\infty)$, with
the infimum taken over all $v, v', t, t'$ in the set $K$ such that $v' \ge v + \epsilon$ and $t' \ge t + \epsilon$.

The first part of the proposition states the weak inequality (2.1), and the second part
states the strict inequality (2.2). For example, the inequality is strict for $p \in (1,\infty)$ if
the original estimate $\hat f(x)$ is decreasing on a subset of $\mathcal{X}$ having positive measure, while
the target function $f_0(x)$ is increasing on $\mathcal{X}$ (by increasing, we mean strictly increasing
throughout). Of course, if $f_0(x)$ is constant, then the inequality (2.1) becomes an
equality, as the distribution of the rearranged function $\hat f^*$ is the same as the distribution of
the original function $\hat f$, that is, $F_{\hat f^*} = F_{\hat f}$.
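For the quadratic case the constant $\eta_p$ can be worked out by hand, which may help fix ideas. The display below simply expands the definition of $\eta_p$ in Proposition 1 with $p = 2$ and is offered only as an illustration.

```latex
\begin{align*}
|v - t'|^2 + |v' - t|^2 - |v - t|^2 - |v' - t'|^2
  &= -2vt' - 2v't + 2vt + 2v't' \\
  &= 2\,(v' - v)(t' - t) \;\ge\; 2\epsilon^2 ,
\end{align*}
```

Hence $\eta_2 \ge 2\epsilon^2 > 0$ whenever $v' \ge v + \epsilon$ and $t' \ge t + \epsilon$. For $p = 1$, by contrast, the same expression equals zero whenever $t \le t' \le v \le v'$, which is consistent with strictness being claimed only for $p \in (1,\infty)$.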

This proposition establishes that the rearranged estimate $\hat f^*$ has a smaller estimation
error in the $L^p$ norm than the original estimate whenever the latter is not monotone.
This is a very useful and generally applicable property that is independent of the sample
size and of the way the original estimate $\hat f$ is obtained.
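A quick numerical check of this property can be run on a grid. The sketch below assumes a particular increasing target and a particular non-monotone estimate, both invented for illustration, and compares discretized $L^p$ errors before and after sorting.

```python
import math


def lp_error(est, target, p):
    """Discretized L^p distance between two functions given on the same grid."""
    diffs = [abs(a - b) for a, b in zip(est, target)]
    if p == math.inf:
        return max(diffs)
    return (sum(d ** p for d in diffs) / len(diffs)) ** (1.0 / p)


n = 400
grid = [(j - 0.5) / n for j in range(1, n + 1)]
target = [x ** 2 for x in grid]                               # increasing target (assumed)
estimate = [x ** 2 + 0.2 * math.sin(20.0 * x) for x in grid]   # non-monotone estimate (assumed)
rearranged = sorted(estimate)                                  # increasing rearrangement

# For each p, store (error of the original estimate, error of the rearranged one).
errors = {p: (lp_error(estimate, target, p), lp_error(rearranged, target, p))
          for p in (1, 2, math.inf)}
```

For every $p$ in the dictionary the second entry should not exceed the first, in line with inequality (2.1). The size of the reduction depends on the example chosen.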

An indirect proof of the weak inequality (2.1) is a simple but important consequence
of the following classical inequality due to Lorentz (1953): Let $q$ and $g$ be two functions
mapping $\mathcal{X}$ to $K$, a bounded subset of $\mathbb{R}$. Let $q^*$ and $g^*$ denote their corresponding
increasing rearrangements. Then,

$$\int_{\mathcal{X}} L\big(q^*(x), g^*(x), x\big)\,dx \le \int_{\mathcal{X}} L\big(q(x), g(x), x\big)\,dx,$$

for any submodular discrepancy function $L : \mathbb{R}^3 \to \mathbb{R}$. Set $q(x) = \hat f(x)$, $q^*(x) =
\hat f^*(x)$, $g(x) = f_0(x)$, and $g^*(x) = f_0^*(x)$. Now, note that in our case $f_0^*(x) = f_0(x)$
almost everywhere, that is, the target function is its own rearrangement. Moreover,
$L(v, w, x) = |w - v|^p$ is submodular for $p \in [1,\infty)$. This proves the first part of the
proposition above. For $p = \infty$, the first part follows by taking the limit as $p \to \infty$.

In Section 3 we provide a proof of the strong inequality (2.2) as well as the direct proof
of the weak inequality (2.1). The direct proof illustrates how reductions of the estimation
error arise from even a partial sorting of the values of the estimate $\hat f$. Moreover, the
direct proof characterizes the conditions for the strict reduction of the estimation error.

The following immediate implication of the above finite-sample result is also worth
emphasizing: The rearranged estimate $\hat f^*$ inherits the $L_p$ rates of convergence from the
original estimate $\hat f$. For $p \in [1,\infty]$, if $\lambda_n = \big[\int_{\mathcal{X}} |f_0(x) - \hat f(x)|^p\,dx\big]^{1/p} = O_P(a_n)$ for some
sequence of constants $a_n$, then $\big[\int_{\mathcal{X}} |f_0(x) - \hat f^*(x)|^p\,dx\big]^{1/p} \le \lambda_n = O_P(a_n)$.

2.3. Computation of the Rearranged Estimate. One of the following methods can
be used for computing the rearrangement. Let $\{X_j, j = 1, \ldots, B\}$ be either (1) a net of
equidistant points in $[0,1]$ or (2) a sample of i.i.d. draws from the uniform distribution
on $[0,1]$. Then the rearranged estimate $\hat f^*(u)$ at point $u \in \mathcal{X}$ can be approximately
computed as the $u$-quantile of the sample $\{\hat f(X_j), j = 1, \ldots, B\}$. The first method is
deterministic, and the second is stochastic. Thus, for a given number of draws $B$, the
complexity of computing the rearranged estimate $\hat f^*(u)$ in this way is equivalent to the
complexity of computing the sample $u$-quantile in a sample of size $B$.
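Both computational variants can be sketched with one helper. The code below is a minimal illustration, assuming a simple empirical-quantile rule and a test function $1 - x$ whose rearrangement is known to be $f^*(u) = u$. The names `rearranged_at` and `decreasing` are hypothetical.

```python
import math
import random


def rearranged_at(f, u, n_points, rng=None):
    """Approximate f*(u) as the empirical u-quantile of {f(X_j)}.

    With rng=None the X_j form an equidistant net in [0, 1] (deterministic);
    otherwise they are i.i.d. uniform draws from rng (stochastic).
    """
    if rng is None:
        xs = [(j - 0.5) / n_points for j in range(1, n_points + 1)]
    else:
        xs = [rng.random() for _ in range(n_points)]
    values = sorted(f(x) for x in xs)
    # Empirical quantile: smallest value whose empirical c.d.f. is >= u.
    k = min(max(math.ceil(u * n_points), 1), n_points)
    return values[k - 1]


def decreasing(x):
    """Hypothetical estimate 1 - x, whose increasing rearrangement is f*(u) = u."""
    return 1.0 - x


by_net = rearranged_at(decreasing, 0.3, 1000)
by_draws = rearranged_at(decreasing, 0.3, 20000, rng=random.Random(0))
```

The sketch sorts the whole sample for simplicity. When only a single quantile is needed, a selection algorithm would avoid the full sort.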

The number of evaluations $B$ can depend on the problem. Suppose that the density
function of the random variable $f(X)$, when $X \sim U(0,1)$, is bounded away from
zero over a neighborhood of $f^*(x)$. Then $f^*(x)$ can be computed with the accuracy
of $O_P(1/\sqrt{B})$, as $B \to \infty$, where the rate follows from the results of Knight (2002).
As shown in Chernozhukov, Fernandez-Val, and Galichon (2006a), the density of $f(X)$,
denoted $F_f'(t)$, exists if $f(x)$ is continuously differentiable and the number of elements
in $\{x \in \mathcal{X} : f'(x) = 0\}$ is bounded; in particular,

$$F_f'(t) = \sum_{x \in f^{-1}(t)} \frac{1}{|f'(x)|}. \tag{2.3}$$

Thus, the density $F_f'(t)$ is bounded away from zero if $f'(x)$ is bounded away from infinity.
Interestingly, the density has infinite poles at $\{t \in K : \text{there is an } x \text{ such that } f'(x) =
0 \text{ and } f(x) = t\}$.
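A simple worked case, chosen here purely for illustration, is $f(x) = x^2$ on $[0,1]$, for which the distribution function, its density, and the rearrangement are all available in closed form:

```latex
\begin{align*}
F_f(t) &= \int_0^1 1\{u^2 \le t\}\,du = \sqrt{t}, \qquad t \in [0,1], \\
F_f'(t) &= \frac{1}{2\sqrt{t}} = \frac{1}{|f'(x)|}\bigg|_{x = \sqrt{t}},
\qquad f^*(x) = \inf\{y : \sqrt{y} \ge x\} = x^2 .
\end{align*}
```

Since $|f'(x)| = 2x \le 2$, the density is bounded below by $1/2$. It has a pole at $t = 0$, the image of the point $x = 0$ where $f'(x) = 0$. Because this $f$ is already increasing, the rearrangement leaves it unchanged.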

2.4. The Rearrangement and Its Approximation Property: The Multivariate
Case. In this section, we consider multivariate functions $f : \mathcal{X}^d \to K$, where $\mathcal{X}^d =
[0,1]^d$ and $K$ is a bounded subset of $\mathbb{R}$. The notion of monotonicity we seek to impose
on $f$ is the following: We say that the function $f$ is weakly increasing in $x$ if $f(x') \ge f(x)$
whenever $x' \ge x$. The notation $x' = (x_1', \ldots, x_d') \ge x = (x_1, \ldots, x_d)$ means that one vector
is weakly larger than the other in each of the components, that is, $x_j' \ge x_j$ for each
$j = 1, \ldots, d$. In what follows, we use the notation $f(x_j, x_{-j})$ to denote the dependence of
$f$ on its $j$-th argument, $x_j$, and all other arguments, $x_{-j}$, that exclude $x_j$. The notion
of monotonicity above is equivalent to the requirement that for each $j$ in $1, \ldots, d$ the
mapping $x_j \mapsto f(x_j, x_{-j})$ is weakly increasing in $x_j$, for each $x_{-j}$ in $\mathcal{X}^{d-1}$.

Define the rearrangement operator $R_j$ and the rearranged function $f_j^*(x)$ with respect to
the $j$-th argument as follows:

$$f_j^*(x) := R_j \circ f(x) := \inf\left\{y : \left[\int_{\mathcal{X}} 1\{f(x_j', x_{-j}) \le y\}\,dx_j'\right] \ge x_j\right\}.$$

This is the one-dimensional increasing rearrangement applied to the one-dimensional
function $x_j \mapsto f(x_j, x_{-j})$, holding the other arguments $x_{-j}$ fixed. The rearrangement is
applied for every value of the other arguments $x_{-j}$.

Let $\pi = (\pi_1, \ldots, \pi_d)$ be an ordering, i.e. a permutation, of the integers $1, \ldots, d$. Let us
define the $\pi$-rearrangement operator $R_\pi$ and the $\pi$-rearranged function $f_\pi^*(x)$ as follows:

$$f_\pi^*(x) := R_\pi \circ f(x) := R_{\pi_1} \circ \cdots \circ R_{\pi_d} \circ f(x).$$

For any ordering $\pi$, the $\pi$-rearrangement operator rearranges the function with respect
to all of its arguments. As shown below, the resulting function $f_\pi^*(x)$ is weakly increasing
in $x$.

In general, two different orderings $\pi$ and $\pi'$ of $1, \ldots, d$ can yield different rearranged
functions $f_\pi^*(x)$ and $f_{\pi'}^*(x)$. Therefore, to resolve the conflict among rearrangements
done with different orderings, we may consider averaging among them: letting $\Pi$ be any
collection of distinct orderings $\pi$, we can define the average rearrangement as

$$f^*(x) := \frac{1}{|\Pi|} \sum_{\pi \in \Pi} f_\pi^*(x), \tag{2.4}$$

where $|\Pi|$ denotes the number of elements in the set of orderings $\Pi$. As shown below, the
approximation error of the average rearrangement is weakly smaller than the average of
approximation errors of individual $\pi$-rearrangements.
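On a rectangular grid, the one-dimensional rearrangements amount to sorting rows or columns. The sketch below treats the bivariate case under assumptions made only for illustration: a midpoint grid, an additive increasing target, and a made-up non-monotone estimate. All function names are hypothetical.

```python
import itertools
import math


def rearrange_axis(grid, axis):
    """Sort a 2-D grid of values along one axis (0: first argument, 1: second)."""
    if axis == 1:
        return [sorted(row) for row in grid]
    cols = [sorted(col) for col in zip(*grid)]
    return [list(row) for row in zip(*cols)]


def pi_rearrange(grid, ordering):
    """Apply R_{pi_1} o ... o R_{pi_d}: the last entry of `ordering` acts first."""
    for axis in reversed(ordering):
        grid = rearrange_axis(grid, axis)
    return grid


def average_rearrangement(grid, orderings):
    """Pointwise average of the pi-rearranged grids, as in (2.4)."""
    results = [pi_rearrange(grid, pi) for pi in orderings]
    m = len(results)
    return [[sum(r[i][j] for r in results) / m for j in range(len(grid[0]))]
            for i in range(len(grid))]


def is_increasing(grid):
    """Check weak monotonicity in both arguments."""
    rows_ok = all(b >= a for row in grid for a, b in zip(row, row[1:]))
    cols_ok = all(b >= a for col in zip(*grid) for a, b in zip(col, col[1:]))
    return rows_ok and cols_ok


def l2_error(a, b):
    """Discretized L^2 distance between two grids of the same shape."""
    n = len(a) * len(a[0])
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n)


n = 30
pts = [(j - 0.5) / n for j in range(1, n + 1)]
target = [[u + x for x in pts] for u in pts]   # increasing in both arguments (assumed)
estimate = [[u + x + 0.3 * math.sin(9 * u) * math.cos(7 * x) for x in pts] for u in pts]
orderings = list(itertools.permutations((0, 1)))
by_ordering = {pi: pi_rearrange(estimate, pi) for pi in orderings}
averaged = average_rearrangement(estimate, orderings)
```

In this example each $\pi$-rearranged grid and their average are increasing in both arguments, and their discretized $L^2$ errors relative to the assumed target do not exceed that of the original estimate.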

The following proposition describes the properties of multivariate $\pi$-rearrangements:

Proposition 2. Let the target function $f_0 : \mathcal{X}^d \to K$ be weakly increasing and measurable
in $x$. Let $\hat f : \mathcal{X}^d \to K$ be a measurable function that is an initial estimate of the
target function $f_0$. Let $\bar f : \mathcal{X}^d \to K$ be another estimate of $f_0$, which is measurable in $x$,
including, for example, a rearranged $\hat f$ with respect to some of the arguments. Then,

1. For each ordering $\pi$ of $1, \ldots, d$, the $\pi$-rearranged estimate $\hat f_\pi^*(x)$ is weakly increasing
in $x$. Moreover, $\hat f^*(x)$, an average of $\pi$-rearranged estimates, is weakly increasing in $x$.

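2. (a) For any $j$ in $1, \ldots, d$ and any $p$ in $[1,\infty]$, the rearrangement of $\bar f$ with respect
to the $j$-th argument produces a weak reduction in the approximation error:

$$\left[\int_{\mathcal{X}^d} \big|\bar f_j^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \left[\int_{\mathcal{X}^d} \big|\bar f(x) - f_0(x)\big|^p\,dx\right]^{1/p}. \tag{2.5}$$
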


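(b) Consequently, a $\pi$-rearranged estimate $\hat f_\pi^*(x)$ of $\hat f(x)$ weakly reduces the
approximation error of the original estimate:

$$\left[\int_{\mathcal{X}^d} \big|\hat f_\pi^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \left[\int_{\mathcal{X}^d} \big|\hat f(x) - f_0(x)\big|^p\,dx\right]^{1/p}. \tag{2.6}$$
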


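3. Suppose that $\bar f(x)$ and $f_0(x)$ have the following properties: there exist subsets
$\mathcal{X}_j \subset \mathcal{X}$ and $\mathcal{X}_j' \subset \mathcal{X}$, each of measure $\delta > 0$, and a subset $\mathcal{X}_{-j} \subseteq \mathcal{X}^{d-1}$, of measure
$\nu > 0$, such that for all $x = (x_j, x_{-j})$ and $x' = (x_j', x_{-j})$, with $x_j' \in \mathcal{X}_j'$, $x_j \in \mathcal{X}_j$,
$x_{-j} \in \mathcal{X}_{-j}$, we have that (i) $x_j' > x_j$, (ii) $\bar f(x) > \bar f(x') + \epsilon$, and (iii) $f_0(x') > f_0(x) + \epsilon$,
for some $\epsilon > 0$.

(a) Then, for any $p \in [1,\infty]$,

$$\left[\int_{\mathcal{X}^d} \big|\bar f_j^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \left[\int_{\mathcal{X}^d} \big|\bar f(x) - f_0(x)\big|^p\,dx - \eta_p\delta\nu\right]^{1/p}, \tag{2.7}$$

where $\eta_p = \inf\{|v - t'|^p + |v' - t|^p - |v - t|^p - |v' - t'|^p\}$, and $\eta_p > 0$ for $p \in (1,\infty)$, with
the infimum taken over all $v, v', t, t'$ in the set $K$ such that $v' \ge v + \epsilon$ and $t' \ge t + \epsilon$.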

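(b) Further, for an ordering $\pi = (\pi_1, \ldots, \pi_k, \ldots, \pi_d)$ with $\pi_k = j$, let $\bar f$ be a partially
rearranged function, $\bar f(x) = R_{\pi_{k+1}} \circ \cdots \circ R_{\pi_d} \circ \hat f(x)$ (for $k = d$ we set $\bar f(x) = \hat f(x)$). If
the function $\bar f(x)$ and the target function $f_0(x)$ satisfy the condition stated above, then,
for any $p \in [1,\infty]$,

$$\left[\int_{\mathcal{X}^d} \big|\hat f_\pi^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \left[\int_{\mathcal{X}^d} \big|\hat f(x) - f_0(x)\big|^p\,dx - \eta_p\delta\nu\right]^{1/p}. \tag{2.8}$$
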
4. The approximation error of an average rearrangement is weakly smaller than the
average approximation error of the individual $\pi$-rearrangements: For any $p \in [1,\infty]$,

$$\left[\int_{\mathcal{X}^d} \big|\hat f^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} \le \frac{1}{|\Pi|} \sum_{\pi \in \Pi} \left[\int_{\mathcal{X}^d} \big|\hat f_\pi^*(x) - f_0(x)\big|^p\,dx\right]^{1/p}. \tag{2.9}$$



This proposition generalizes the results of Proposition 1 to the multivariate case,
also demonstrating several features unique to the multivariate case. We see that the
$\pi$-rearranged functions are monotonic in all of the arguments. The rearrangement along
any argument improves the approximation properties of the estimate. Moreover, the
improvement is strict when the rearrangement with respect to a $j$-th argument is
performed on an estimate that is decreasing in the $j$-th argument, while the target function
is increasing in the same $j$-th argument, in the sense precisely defined in the proposition.
Moreover, averaging different $\pi$-rearrangements is better (on average) than using a single
$\pi$-rearrangement chosen at random. All other basic implications of the proposition are
similar to those discussed for the univariate case.


3.  Proofs  of  Propositions 

3.1.  Proof  of  Proposition  1.  The  first  part  establishes  the  weak  inequality,  following 
in  part  the  strategy  in  Lorentz's  (1953)  proof.  The  proof  focuses  directly  on  obtaining 
the  result  stated  in  the  proposition.  The  second  part  establishes  the  strong  inequality. 

Proof of Part 1. We assume at first that the functions $\hat f(\cdot)$ and $f_0(\cdot)$ are simple
functions, constant on intervals $((s-1)/r, s/r]$, $s = 1, \ldots, r$. For any simple $f(\cdot)$ with $r$
steps, let $f$ denote the $r$-vector with the $s$-th element, denoted $f_s$, equal to the value of
$f(\cdot)$ on the $s$-th interval. Let us define the sorting operator $S(f)$ as follows: Let $\ell$ be an
integer in $1, \ldots, r$ such that $f_\ell > f_m$ for some $m > \ell$. If $\ell$ does not exist, set $S(f) = f$. If
$\ell$ exists, set $S(f)$ to be an $r$-vector with the $\ell$-th element equal to $f_m$, the $m$-th element
equal to $f_\ell$, and all other elements equal to the corresponding elements of $f$. For any
submodular function $L : \mathbb{R}^2 \to \mathbb{R}_+$, by $f_\ell \ge f_m$, $f_{0m} \ge f_{0\ell}$ and the definition of
submodularity,

$$L(f_m, f_{0\ell}) + L(f_\ell, f_{0m}) \le L(f_\ell, f_{0\ell}) + L(f_m, f_{0m}).$$
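The four-point inequality can be spot-checked numerically for the discrepancy $L(v, t) = |v - t|^p$. The sketch below draws random inverted pairs against a monotone target and records the gain from one swap. The helper name, the random seed, and the choice of exponents are assumptions made for illustration.

```python
import random


def swap_gain(f_l, f_m, t_l, t_m, p):
    """Loss before minus loss after swapping f_l and f_m, for L(v, t) = |v - t|**p."""
    before = abs(f_l - t_l) ** p + abs(f_m - t_m) ** p
    after = abs(f_m - t_l) ** p + abs(f_l - t_m) ** p
    return before - after


rng = random.Random(1)
gains = []
for _ in range(10000):
    f_m, f_l = sorted(rng.random() for _ in range(2))   # f_l >= f_m: an inverted pair
    t_l, t_m = sorted(rng.random() for _ in range(2))   # t_m >= t_l: monotone target
    p = rng.choice([1.0, 1.5, 2.0, 4.0])
    gains.append(swap_gain(f_l, f_m, t_l, t_m, p))
smallest_gain = min(gains)
```

Up to floating-point rounding, `smallest_gain` should be nonnegative, which is the content of the displayed inequality for this choice of $L$.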


Therefore, we conclude that

$$\int_{\mathcal{X}} L\big(S(\hat f)(x), f_0(x)\big)\,dx \le \int_{\mathcal{X}} L\big(\hat f(x), f_0(x)\big)\,dx, \tag{3.1}$$

using that we integrate simple functions.

Applying the sorting operation a sufficient finite number of times to $\hat f$, we obtain a
completely sorted, that is, rearranged, vector $\hat f^*$. Thus, we can express $\hat f^*$ as a finite
composition $\hat f^* = S \circ \cdots \circ S(\hat f)$. By repeating the argument above, each composition
weakly reduces the approximation error. Therefore,

$$\int_{\mathcal{X}} L\big(\hat f^*(x), f_0(x)\big)\,dx \le \int_{\mathcal{X}} L\big(\underbrace{S \circ \cdots \circ S}_{\text{finite times}}(\hat f)(x), f_0(x)\big)\,dx \le \int_{\mathcal{X}} L\big(\hat f(x), f_0(x)\big)\,dx. \tag{3.2}$$

Furthermore, this inequality is extended to general measurable functions $\hat f(\cdot)$ and $f_0(\cdot)$
mapping $\mathcal{X}$ to $K$ by taking a sequence of bounded simple functions $\hat f^{(r)}(\cdot)$ and $f_0^{(r)}(\cdot)$
converging to $\hat f(\cdot)$ and $f_0(\cdot)$ almost everywhere as $r \to \infty$. The almost everywhere
convergence of $\hat f^{(r)}(\cdot)$ to $\hat f(\cdot)$ implies the almost everywhere convergence of its quantile
function $\hat f^{*(r)}(\cdot)$ to the quantile function of the limit, $\hat f^*(\cdot)$. Since inequality (3.2) holds
along the sequence, the dominated convergence theorem implies that (3.2) also holds for
the general case. □
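The step-function argument can be traced on a small vector. The sketch below applies a sorting operator of the kind described in the proof, one swap at a time, and records the squared error after each swap. The particular vectors and the rule of swapping the first inverted pair found are assumptions made for illustration.

```python
def sort_step(f):
    """One application of the sorting operator S: swap the first inverted pair
    (l, m) with l < m and f[l] > f[m]; return f unchanged if none exists."""
    f = list(f)
    for l in range(len(f)):
        for m in range(l + 1, len(f)):
            if f[l] > f[m]:
                f[l], f[m] = f[m], f[l]
                return f
    return f


def squared_error(f, target):
    """Average squared difference between two step-function value vectors."""
    return sum((a - b) ** 2 for a, b in zip(f, target)) / len(f)


target = [0.1, 0.2, 0.3, 0.4, 0.5]       # weakly increasing step values (assumed)
current = [0.35, 0.1, 0.5, 0.2, 0.45]    # hypothetical non-monotone estimate
errors = [squared_error(current, target)]
while True:
    nxt = sort_step(current)
    if nxt == current:
        break
    current = nxt
    errors.append(squared_error(current, target))
```

In this run four swaps are enough to sort the vector, and the recorded errors never increase from one swap to the next, so even a partial sorting already helps.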

Proof of Part 2. Let us first consider the case of simple functions, as defined in Part
1. We take the functions to satisfy the following hypotheses: there exist regions $\mathcal{X}_0$ and
$\mathcal{X}_0'$, each of measure greater than $\delta > 0$, such that for all $x \in \mathcal{X}_0$ and $x' \in \mathcal{X}_0'$, we have
that (i) $x' > x$, (ii) $\hat f(x) > \hat f(x') + \epsilon$, and (iii) $f_0(x') > f_0(x) + \epsilon$, for $\epsilon > 0$ specified in
the proposition. For any strictly submodular function $L : \mathbb{R}^2 \to \mathbb{R}_+$ we have that

$$\eta = \inf\{L(v', t) + L(v, t') - L(v, t) - L(v', t')\} > 0,$$

where the infimum is taken over all $v, v', t, t'$ in the set $K$ such that $v' \ge v + \epsilon$ and
$t' \ge t + \epsilon$.

We can begin sorting by exchanging an element $\hat f(x)$, $x \in \mathcal{X}_0$, of the $r$-vector $\hat f$ with an
element $\hat f(x')$, $x' \in \mathcal{X}_0'$, of the $r$-vector $\hat f$. This induces a sorting gain of at least $\eta$ times $1/r$.
The total mass of points that can be sorted in this way is at least $\delta$. We then proceed to
sort all of these points in this way, and then continue with the sorting of other points.
After the sorting is completed, the total gain from sorting is at least $\delta\eta$. That is,

$$\int_{\mathcal{X}} L\big(\hat f^*(x), f_0(x)\big)\,dx \le \int_{\mathcal{X}} L\big(\hat f(x), f_0(x)\big)\,dx - \delta\eta.$$

We then extend this inequality to general measurable functions exactly as in the
proof of Part 1. □

3.2.  Proof  of  Proposition  2.  The  proof  consists  of  the  following  four  parts. 

Proof of Part 1. We prove the claim by induction. The claim is true for $d = 1$ by
$\hat f^*(x)$ being a quantile function. We then consider any $d \ge 2$. Suppose the claim is
true in $d - 1$ dimensions. If so, then the estimate $\bar f(x_j, x_{-j})$, obtained from the original
estimate $\hat f(x)$ after applying the rearrangement to all arguments $x_{-j}$ of $x$, except for the
argument $x_j$, must be weakly increasing in $x_{-j}$ for each $x_j$. Thus, for any $x_{-j}' \ge x_{-j}$,
we have that

$$\bar f(X_j, x_{-j}') \ge \bar f(X_j, x_{-j}) \quad \text{for } X_j \sim U(0,1). \tag{3.3}$$

Therefore, the random variable on the left of (3.3) dominates the random variable on
the right of (3.3) in the stochastic sense. Therefore, the quantile function of the random
variable on the left dominates the quantile function of the random variable on the right,
namely

$$\bar f_j^*(x_j, x_{-j}') \ge \bar f_j^*(x_j, x_{-j}) \quad \text{for each } x_j \in \mathcal{X} = (0,1). \tag{3.4}$$

Moreover, for each $x_{-j}$, the function $x_j \mapsto \bar f_j^*(x_j, x_{-j})$ is weakly increasing by virtue of
being a quantile function. We conclude therefore that $x \mapsto \bar f_j^*(x)$ is weakly increasing
in all of its arguments at all points $x \in \mathcal{X}^d$. The claim of Part 1 of the Proposition now
follows by induction. □

Proof of Part 2 (a). By Proposition 1, we have that for each $x_{-j}$,

$$\int_{\mathcal{X}} \big|\bar f_j^*(x_j, x_{-j}) - f_0(x_j, x_{-j})\big|^p\,dx_j \le \int_{\mathcal{X}} \big|\bar f(x_j, x_{-j}) - f_0(x_j, x_{-j})\big|^p\,dx_j. \tag{3.5}$$

Now, the claim follows by integrating with respect to $x_{-j}$ and taking the $p$-th root of
both sides. For $p = \infty$, the claim follows by taking the limit as $p \to \infty$. □

Proof of Part 2 (b). We first apply the inequality of Part 2(a) to $\bar f(x) = \hat f(x)$, then
to $\bar f(x) = R_{\pi_d} \circ \hat f(x)$, then to $\bar f(x) = R_{\pi_{d-1}} \circ R_{\pi_d} \circ \hat f(x)$, and so on. In doing so,
we recursively generate a sequence of weak inequalities that imply the inequality (2.6)
stated in the Proposition. □
Proof of Part 3 (a). For each $x_{-j} \in \mathcal{X}^{d-1} \setminus \mathcal{X}_{-j}$, by Part 2(a), we have the weak
inequality (3.5), and for each $x_{-j} \in \mathcal{X}_{-j}$, by the inequality for the univariate case stated
in Proposition 1 Part 2, we have the strong inequality

$$\int_{\mathcal{X}} \big|\bar f_j^*(x_j, x_{-j}) - f_0(x_j, x_{-j})\big|^p\,dx_j \le \int_{\mathcal{X}} \big|\bar f(x_j, x_{-j}) - f_0(x_j, x_{-j})\big|^p\,dx_j - \eta_p\delta, \tag{3.6}$$

where $\eta_p$ is defined in the same way as in Proposition 1. Integrating the weak inequality
(3.5) over $x_{-j} \in \mathcal{X}^{d-1} \setminus \mathcal{X}_{-j}$, of measure $1 - \nu$, and the strong inequality (3.6) over $\mathcal{X}_{-j}$,
of measure $\nu$, we obtain

$$\int_{\mathcal{X}^d} \big|\bar f_j^*(x) - f_0(x)\big|^p\,dx \le \int_{\mathcal{X}^d} \big|\bar f(x) - f_0(x)\big|^p\,dx - \eta_p\delta\nu. \tag{3.7}$$

The claim now follows. □

Proof of Part 3 (b). As in Part 2(a), we can recursively obtain a sequence of weak
inequalities describing the improvements in approximation error from rearranging
sequentially with respect to the individual arguments. Moreover, at least one of the
inequalities can be strengthened to be of the form stated in (3.7), from the assumption
of the claim. The resulting system of inequalities yields the inequality (2.8), stated in
the proposition. □

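Proof of Part 4. We can write

$$\left[\int_{\mathcal{X}^d} \big|\hat f^*(x) - f_0(x)\big|^p\,dx\right]^{1/p} = \left[\int_{\mathcal{X}^d} \Big|\frac{1}{|\Pi|} \sum_{\pi \in \Pi} \big(\hat f_\pi^*(x) - f_0(x)\big)\Big|^p\,dx\right]^{1/p} \le \frac{1}{|\Pi|} \sum_{\pi \in \Pi} \left[\int_{\mathcal{X}^d} \big|\hat f_\pi^*(x) - f_0(x)\big|^p\,dx\right]^{1/p}, \tag{3.8}$$

where the last inequality follows by pulling out $1/|\Pi|$ and then applying the triangle
inequality for the $L_p$ norm. □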


4.  Illustrations 

In this section we provide an empirical application to biometric age-height charts.
We show how the rearrangement monotonizes and improves various nonparametric
estimates, and then we quantify the improvement in a simulation example that mimics the
empirical application.
4.1. An Empirical Illustration with Age-Height Reference Charts. Since their
introduction by Quetelet in the 19th century, reference growth charts have become common
tools to assess an individual's health status. These charts describe the evolution
of individual anthropometric measures, such as height, weight, and body mass index,
across different ages. See Cole (1988) for a classical work on the subject and Wei, Pere,
Koenker, and He (2006) for a recent analysis from a quantile regression perspective and
additional references.

To illustrate the properties of the rearrangement method we consider the estimation
of growth charts for height. Height should naturally follow an increasing
relationship with age. Our data consist of repeated cross-sectional measurements of
height and age from the 2003-2004 National Health and Nutrition Examination Survey
collected by the National Center for Health Statistics. Height is measured as standing
height in centimeters, and age is recorded in months and expressed in years. To avoid
confounding factors that might affect the relationship between age and height, we restrict
the sample to US-born white males age two through twenty. Our final sample consists of
533 subjects almost evenly distributed across these ages.

Let $Y$ and $X$ denote height and age, respectively. Let $E[Y|X = x]$ denote the
conditional expectation of $Y$ given $X = x$, and $Q_Y(u|X = x)$ denote the $u$-th quantile
of $Y$ given $X = x$, where $u$ is the quantile index. The population functions of interest are
(1) the conditional expectation function (CEF), (2) the conditional quantile functions
(CQF) for several quantile indices (5%, median, and 95%), and (3) the entire
conditional quantile process (CQP) for height given age. In the first case, the target function
$x \mapsto f_0(x)$ is $x \mapsto E[Y|X = x]$; in the second case, the target function $x \mapsto f_0(x)$ is
$x \mapsto Q_Y(u|X = x)$, for $u$ = 5%, 50%, and 95%; and, in the third case, the target
function $(u,x) \mapsto f_0(u,x)$ is $(u,x) \mapsto Q_Y(u|X = x)$. The natural monotonicity requirements
for the target functions are the following: The CEF $x \mapsto E[Y|X = x]$ and the CQF
$x \mapsto Q_Y(u|X = x)$ should be increasing in age $x$, and the CQP $(u,x) \mapsto Q_Y(u|X = x)$
should be increasing in both age $x$ and the quantile index $u$.

We estimate the target functions using nonparametric ordinary least squares or
quantile regression techniques and then rearrange the estimates to satisfy the monotonicity
requirements. We consider (a) kernel, (b) local linear, (c) spline, (d) global polynomial,
(e) Fourier, and (f) flexible Fourier methods. For the kernel method, we provide a fit
on a cell-by-cell basis, with each cell corresponding to one month. For the local linear
method, we choose a bandwidth of one year and a box kernel. For the spline method,
we use cubic B-splines with a knot sequence {3, 5, 8, 10, 11.5, 13, 14.5, 16, 18}, following
Wei, Pere, Koenker, and He (2006). For the global polynomial method, we fit a quartic
polynomial. For the Fourier method, we employ eight trigonometric terms, with four
sines and four cosines. For the flexible Fourier method, we use a quadratic polynomial
and four trigonometric terms, with two sines and two cosines. Finally, for the
estimation of the conditional quantile process, we use a net of two hundred quantile indices
{0.005, 0.010, ..., 0.995}. In the choice of the parameters for the different methods, we
select values that either have been used in previous empirical work or give rise to
specifications with similar complexities for the different methods.

Panels A-F of Figure 1 show the original and rearranged estimates of the
conditional expectation function for the different methods. All the estimated curves have
trouble capturing the slowdown in the growth of height after age sixteen and yield
non-monotonic curves for the highest values of age. The Fourier series have a special difficulty
approximating the aperiodic age-height relationship. The rearranged estimates correct
the non-monotonicity of the original estimates, providing weakly increasing curves that
coincide with the original estimates in the parts where the latter are monotonic.
Moreover, the rearranged estimates necessarily improve upon the original estimates, since,
by the theoretical results derived earlier, they are closer to the true functions than the
original estimates. We quantify this improvement in the next subsection.
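
The rearrangement step described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the code used for the figures: it assumes the estimate is evaluated on an equally spaced grid, in which case the increasing rearrangement amounts to sorting the fitted values. The target `f0`, the wiggly estimate `f_hat`, and the helper names `rearrange` and `lp_error` are hypothetical.

```python
import numpy as np


def rearrange(values):
    """Increasing rearrangement of a curve sampled on an equally spaced grid.

    On such a grid the rearrangement reduces to sorting the sampled values.
    """
    return np.sort(values)


def lp_error(est, target, p, dx):
    """Discretized L^p distance between two sampled curves (p may be np.inf)."""
    diff = np.abs(est - target)
    if np.isinf(p):
        return diff.max()
    return (np.sum(diff ** p) * dx) ** (1.0 / p)


# Hypothetical example: an increasing target and a non-monotone estimate of it.
x = np.linspace(0.0, 1.0, 501)
dx = x[1] - x[0]
f0 = x ** 2                                  # weakly increasing target
f_hat = f0 + 0.1 * np.sin(12 * np.pi * x)    # wiggly, non-monotone estimate
f_star = rearrange(f_hat)                    # monotone by construction

for p in (1, 2, 3, 4, np.inf):
    e_orig = lp_error(f_hat, f0, p, dx)
    e_rearr = lp_error(f_star, f0, p, dx)
    print(f"p={p}: original {e_orig:.4f}, rearranged {e_rearr:.4f}")
```

In this discretized setting the sorted curve is weakly increasing and its error against an increasing target is never larger than that of the unsorted curve, which mirrors the theoretical property invoked in the text.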

Figure  2  displays  similar  but  more  pronounced  non-monotonicity  patterns  for  the 
estimates  of  the  conditional  quantile  functions.  The  rearrangement  again  performs  well 
in  delivering  curves  that  improve  upon  the  original  estimates  and  that  satisfy  the  natural 
monotonicity  requirement. 

Figures 3-8 illustrate the multivariate rearrangement of the conditional quantile
process (CQP) along both the age and the quantile index arguments. We plot in three
dimensions the original estimate, its age rearrangement, its quantile rearrangement, and
its average multivariate rearrangement (the average of the age-quantile and quantile-age
rearrangements). We also plot the corresponding contour surfaces. (Here, we do not
show the multivariate age-quantile and quantile-age rearrangements separately, because
they are very similar to the average multivariate rearrangement.) We see from the
contour plots that, for all of the estimation methods considered, the estimated CQP is
non-monotone in age and non-monotone in the quantile index at extremal values of this
index. The contour plots for the estimates based on the Fourier series best illustrate the
non-monotonicity problem. We see that the average multivariate rearrangement fixes the
non-monotonicity problem, and delivers an estimate of the CQP that is monotone in
both the age and the quantile index arguments. Furthermore, by the theoretical results
of the paper, the multivariate rearranged estimates necessarily improve upon the original
estimates.
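
The average multivariate rearrangement can be sketched on a grid as follows. This is an illustrative sketch under stated assumptions rather than the procedure used for the figures: it assumes the surface is evaluated on a grid that is equally spaced in each argument, so that each univariate rearrangement is a sort along one axis. The array layout (axis 0 for the quantile index, axis 1 for age), the test surface, and the function names are hypothetical.

```python
import numpy as np


def rearrange_along(grid_values, axis):
    """Univariate increasing rearrangement along one axis of a gridded surface."""
    return np.sort(grid_values, axis=axis)


def average_rearrangement(grid_values):
    """Average of the two sequential rearrangements of a 2-D gridded surface.

    Axis 0 indexes the quantile index u and axis 1 indexes age x (assumed layout).
    """
    u_then_x = rearrange_along(rearrange_along(grid_values, 0), 1)
    x_then_u = rearrange_along(rearrange_along(grid_values, 1), 0)
    return 0.5 * (u_then_x + x_then_u)


# Hypothetical surface: increasing in both arguments, plus a non-monotone wiggle.
u = np.linspace(0.005, 0.995, 199)[:, None]
x = np.linspace(2.0, 20.0, 217)[None, :]
f0 = 90.0 + 4.0 * x + 10.0 * u
f_hat = f0 + 3.0 * np.sin(2.0 * x) * np.cos(9.0 * u)
f_avg = average_rearrangement(f_hat)

monotone_in_u = np.all(np.diff(f_avg, axis=0) >= 0)
monotone_in_x = np.all(np.diff(f_avg, axis=1) >= 0)
print(monotone_in_u, monotone_in_x)
```

Sorting along one axis and then the other leaves the surface sorted along both, so each sequential rearrangement, and hence their average, is monotone in both arguments on the grid.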

4.2. Monte Carlo Illustration. The following Monte Carlo experiment quantifies the
improvement in the estimation/approximation properties of the rearranged estimates
relative to the original estimates. The experiment closely matches the empirical
application presented above.

Specifically, we consider the design where the outcome variable $Y$ equals a location
function plus a disturbance $\epsilon$, $Y = Z(X)'\beta + \epsilon$, and the disturbance is independent of the
regressor $X$. The vector $Z(X)$ includes a constant and a piecewise linear transformation
of the regressor $X$ with three changes of slope, namely
$Z(X) = (1, X, 1\{X > 5\} \cdot (X - 5), 1\{X > 10\} \cdot (X - 10), 1\{X > 15\} \cdot (X - 15))$.
This design implies the conditional expectation function

$$ E[Y|X] = Z(X)'\beta, \qquad (4.1) $$

and the conditional quantile function

$$ Q_Y(u|X) = Z(X)'\beta + Q_\epsilon(u). \qquad (4.2) $$

We select the parameters of the design to match the empirical example of growth charts
in the previous subsection. Thus, we set the parameter $\beta$ equal to the ordinary least
squares estimate obtained in the growth chart data, namely (71.25, 8.13, -2.72, 1.78,
-6.43). This parameter value and the location specification (4.2) imply a model for CEF
and CQP that is monotone in age over the range of 2-20. To generate the values of the
dependent variable, we draw disturbances from a normal distribution with the mean and
variance equal to the mean and variance of the estimated residuals, $\epsilon = Y - Z(X)'\beta$,
in the growth chart data. We fix the regressor $X$ in all of the replications to be the
observed values of age in the growth chart data set. In each replication, we estimate the
CEF and CQP using the nonparametric methods described in the previous subsection.
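
The design can be written out as a short sketch, which also makes the monotonicity claim easy to check: the slopes of the CEF on the four age segments are the cumulative sums of the last four coefficients, 8.13, 5.41, 7.19, and 0.76, all positive. The coefficient vector is the one quoted in the text; everything else is an assumption for illustration. In particular, the simulated ages and the disturbance standard deviation `sigma` are placeholders (the paper uses the observed ages and the moments of the estimated residuals, which are not reproduced here), the disturbance mean is taken to be zero, and the function names are hypothetical.

```python
from statistics import NormalDist

import numpy as np

BETA = np.array([71.25, 8.13, -2.72, 1.78, -6.43])  # values quoted in the text


def design(x):
    """Z(X): a constant plus a piecewise linear transform with kinks at 5, 10, 15."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.column_stack([
        np.ones_like(x),
        x,
        np.maximum(x - 5.0, 0.0),    # 1{X > 5} (X - 5)
        np.maximum(x - 10.0, 0.0),   # 1{X > 10} (X - 10)
        np.maximum(x - 15.0, 0.0),   # 1{X > 15} (X - 15)
    ])


def cef(x):
    """E[Y | X = x] = Z(x)' beta, as in (4.1)."""
    return design(x) @ BETA


def cqf(x, u, sigma):
    """Q_Y(u | X = x) = Z(x)' beta + Q_eps(u), as in (4.2), for N(0, sigma^2) noise."""
    return cef(x) + NormalDist(0.0, sigma).inv_cdf(u)


# Slopes of the CEF on the age segments (2,5), (5,10), (10,15), (15,20).
slopes = np.cumsum(BETA[1:])

# One simulated sample; ages and sigma are placeholders, not the paper's values.
rng = np.random.default_rng(0)
ages = rng.uniform(2.0, 20.0, size=533)
sigma = 5.0
y = cef(ages) + rng.normal(0.0, sigma, size=ages.size)
print(slopes)
```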

In Table 1 we report the average $L^p$ errors (for $p$ = 1, 2, 3, 4, and ∞) for the original
estimates and the rearranged estimates of the CEF. We also report the relative efficiency
of the two estimates, measured as the ratio of the average error of the rearranged estimate
to the average error of the original estimate. We calculate the average $L^p$ error as the
Monte Carlo average of

$$ L^p := \left[\int_{\mathcal{X}} \left|f(x) - f_0(x)\right|^p dx\right]^{1/p}, $$

where the target function $f_0(x)$ is the CEF $E[Y|X = x]$, and the estimate $f(x)$ denotes
either the original nonparametric estimate of the CEF or its increasing rearrangement.
For all of the methods considered, we find that the rearranged curves estimate the true
CEF more accurately than the original curves, providing a 2% to 84% reduction in the
average error, depending on the method and the norm (i.e., the value of $p$).
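
The error measure and the relative efficiency reported in Table 1 can be mimicked in a small simulation. The sketch below is only an illustration of the computation, under assumptions that differ from the paper's experiment: the increasing target `80 + 40 log(x)`, the equally spaced ages, the noise level, the number of replications, and the function names are all hypothetical, and only a quartic polynomial fit is considered. The integral is approximated by a Riemann sum on an equally spaced grid, and the rearrangement is computed by sorting on that grid.

```python
import numpy as np


def lp_error(est, target, p, dx):
    """Discretized L^p error; the sup-norm is used when p is np.inf."""
    diff = np.abs(est - target)
    return diff.max() if np.isinf(p) else (np.sum(diff ** p) * dx) ** (1.0 / p)


def relative_efficiency(n_rep=200, n=533, sigma=5.0, seed=0):
    """Average L^p errors of a quartic fit and of its rearrangement, and their ratio."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(2.0, 20.0, 361)
    dx = grid[1] - grid[0]
    target = 80.0 + 40.0 * np.log(grid)   # hypothetical increasing CEF
    ages = np.linspace(2.0, 20.0, n)      # regressor held fixed across replications
    mean_y = 80.0 + 40.0 * np.log(ages)
    ps = (1, 2, 3, 4, np.inf)
    err_orig = np.zeros(len(ps))
    err_rearr = np.zeros(len(ps))
    for _ in range(n_rep):
        y = mean_y + rng.normal(0.0, sigma, size=n)
        coefs = np.polynomial.polynomial.polyfit(ages, y, deg=4)
        fit = np.polynomial.polynomial.polyval(grid, coefs)
        fit_sorted = np.sort(fit)         # rearrangement on the equally spaced grid
        for j, p in enumerate(ps):
            err_orig[j] += lp_error(fit, target, p, dx) / n_rep
            err_rearr[j] += lp_error(fit_sorted, target, p, dx) / n_rep
    return ps, err_orig, err_rearr, err_rearr / err_orig


ps, err_orig, err_rearr, ratio = relative_efficiency()
for p, r in zip(ps, ratio):
    print(f"p={p}: ratio of average errors = {r:.3f}")
```

A ratio below one corresponds to a reduction in the average error of one minus the ratio; for example, the ratio 0.16 in Table 1 corresponds to the 84% reduction quoted in the text.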

In Table 2 we report the average $L^p$ errors for the original estimates of the conditional
quantile process and their multivariate rearrangement with respect to the age and
quantile index arguments. We also report the ratio of the average error of the rearranged
estimate to the average error of the original estimate. The average $L^p$ error is the Monte
Carlo average of

$$ L^p := \left[\int_{\mathcal{U}} \int_{\mathcal{X}} \left|f(u,x) - f_0(u,x)\right|^p dx \, du\right]^{1/p}, $$

where the target function $f_0(u,x)$ is the conditional quantile process $Q_Y(u|X = x)$, and
the estimate $f(u,x)$ denotes either the original nonparametric estimate of the conditional
quantile process or its multivariate rearrangement. We present the results for the
average multivariate rearrangement only. The age-quantile and quantile-age multivariate
rearrangements give errors that are very similar to their average multivariate
rearrangement, and we therefore do not report them separately. For all the methods considered,
we find that the multivariate rearranged curves estimate the true CQP more accurately
than the original curves, providing a 4% to 74% reduction in the approximation error,
depending on the method and the norm.

In Table 3 we report the average $L^p$ error for the univariate rearrangements of the
conditional quantile function along either the age argument or the quantile index
argument. We also report the ratio of the average error for these rearrangements to the
average error of the original estimates. For all of the methods considered, we find that
these rearranged curves estimate the true CQP more accurately than the original curves,
providing noticeable reductions in the estimation error. Moreover, in this case the
rearrangement along the age argument is more effective in reducing the estimation error
than the rearrangement along the quantile index. Furthermore, by comparing Tables 2
and 3, we also see that the multivariate rearrangement provides an improvement over
the individual univariate rearrangements, yielding estimates of the CQP that are often
much closer to the true process.

5.  Conclusion 

Suppose that a target function is known to be weakly increasing, and we have an
original estimate of this function, which is not weakly increasing. Common
estimation methods provide estimates with such a property. We show that these estimates
can always be improved using rearrangement techniques. The univariate and
multivariate rearrangement methods transform the original estimate to a monotonic estimate.
The resulting monotonic estimate is in fact closer to the target function in common
metrics than the original estimate. We illustrate these theoretical results with a
computational example and an empirical example, dealing with estimation of conditional
mean and quantile functions of height given age. The rearrangement both monotonizes
and improves the original non-monotone estimates. It would be interesting to determine
whether this improved estimation/approximation property carries over to other methods
of monotonization. We leave this extension for future research.

Table 1. L^p Estimation/Approximation Error of Original and Rearranged Estimates
of the Conditional Expectation Function, for p = 1, 2, 3, 4, and ∞. Univariate
Rearrangement.

 p     L^p_O   L^p_R   L^p_R/L^p_O      L^p_O   L^p_R   L^p_R/L^p_O

       A. Kernel                        B. Local Polynomial
 1     3.69    1.33    0.36             0.79    0.76    0.96
 2     4.80    1.84    0.38             1.00    0.96    0.96
 3     5.81    2.46    0.42             1.17    1.13    0.96
 4     6.72    3.12    0.46             1.33    1.28    0.96
 ∞     16.8    9.84    0.58             2.96    2.81    0.95

       C. Splines                       D. Quartic
 1     0.87    0.81    0.93             1.33    1.19    0.89
 2     1.10    1.02    0.93             1.64    1.46    0.89
 3     1.31    1.22    0.93             1.89    1.68    0.89
 4     1.52    1.39    0.92             2.10    1.87    0.89
 ∞     3.72    3.19    0.86             4.38    3.79    0.87

       E. Fourier                       F. Flexible Fourier
 1     6.57    3.21    0.49             0.73    0.72    0.97
 2     10.7    3.79    0.35             0.91    0.89    0.97
 3     15.2    4.24    0.28             1.06    1.04    0.98
 4     19.0    4.59    0.24             1.18    1.16    0.98
 ∞     48.9    7.79    0.16             2.44    2.40    0.98

Notes: The table is based on 10,000 replications.
L^p_O is the L^p error of the original estimate.
L^p_R is the L^p error of the rearranged estimate.

Table 2. L^p Estimation/Approximation Error of Original and Rearranged Estimates
of the Conditional Quantile Process, for p = 1, 2, 3, 4, and ∞. Average Multivariate
Rearrangement.

 p     L^p_O   L^p_RR  L^p_RR/L^p_O     L^p_O   L^p_RR  L^p_RR/L^p_O

       A. Kernel                        B. Local Polynomial
 1     5.35    3.13    0.58             1.21    1.09    0.91
 2     6.97    4.37    0.63             1.61    1.46    0.91
 3     8.40    5.49    0.65             2.03    1.84    0.91
 4     9.72    6.49    0.67             2.48    2.24    0.91
 ∞     34.3    26.4    0.77             12.3    10.4    0.84

       C. Splines                       D. Quartic
 1     1.33    1.20    0.90             1.49    1.35    0.90
 2     1.78    1.60    0.90             1.87    1.69    0.90
 3     2.30    2.03    0.88             2.23    1.99    0.89
 4     2.92    2.50    0.86             2.62    2.29    0.87
 ∞     16.9    12.1    0.72             12.6    8.61    0.68

       E. Fourier                       F. Flexible Fourier
 1     6.72    4.18    0.62             1.05    1.00    0.96
 2     13.7    5.35    0.39             1.38    1.31    0.95
 3     20.8    6.36    0.31             1.72    1.63    0.95
 4     26.7    7.25    0.27             2.12    1.98    0.94
 ∞     84.9    21.9    0.26             10.9    9.13    0.84

Notes: The table is based on 1,000 replications.
L^p_O is the L^p error of the original estimate.
L^p_RR is the L^p error of the average multivariate rearranged estimate.

Table 3. L^p Estimation/Approximation Error of Rearranged Estimates of the
Conditional Quantile Process, for p = 1, 2, 3, 4, and ∞. Univariate Rearrangements.

 p     L^p_Ru  L^p_Rx  L^p_Ru/L^p_O  L^p_Rx/L^p_O     L^p_Ru  L^p_Rx  L^p_Ru/L^p_O  L^p_Rx/L^p_O

       A. Kernel                                      B. Local Polynomial
 1     5.35    3.13    1.00          0.58             1.20    1.10    1.00          0.91
 2     6.97    4.37    1.00          0.63             1.60    1.47    1.00          0.91
 3     8.40    5.49    1.00          0.65             2.01    1.85    0.99          0.91
 4     9.72    6.49    1.00          0.67             2.45    2.26    0.99          0.91
 ∞     34.3    26.4    1.00          0.77             11.8    10.8    0.96          0.88

       C. Splines                                     D. Quartic
 1     1.31    1.21    0.99          0.91             1.49    1.35    1.00          0.91
 2     1.75    1.63    0.98          0.91             1.87    1.69    1.00          0.90
 3     2.24    2.08    0.97          0.90             2.22    2.00    0.99          0.90
 4     2.80    2.59    0.96          0.89             2.60    2.30    0.99          0.88
 ∞     14.4    13.9    0.85          0.82             11.9    9.11    0.95          0.72

       E. Fourier                                     F. Flexible Fourier
 1     6.71    4.19    1.00          0.62             1.04    1.01    0.99          0.96
 2     13.7    5.36    1.00          0.39             1.36    1.32    0.99          0.96
 3     20.8    6.37    1.00          0.31             1.70    1.65    0.99          0.96
 4     26.7    7.26    1.00          0.27             2.08    2.02    0.98          0.95
 ∞     84.9    22.2    1.00          0.26             10.0    9.86    0.92          0.91

Notes: The table is based on 1,000 replications.
L^p_O is the L^p error of the original estimate.
L^p_Rx is the L^p error of the estimate rearranged in age x.
L^p_Ru is the L^p error of the estimate rearranged in the quantile index u.

[Figure 1 panels, each plotted against Age: A. CEF (Kernel); B. CEF (Local Pol.);
C. CEF (Splines); D. CEF (Quartic); E. CEF (Fourier); F. CEF (Flexible Fourier).]

Figure 1. Nonparametric estimates of the Conditional Expectation
Function (CEF) of height given age and their increasing rearrangements.
Nonparametric estimates are obtained using kernel regression (A), locally
linear regression (B), cubic B-splines series (C), a four degree polynomial
(D), Fourier series (E), and flexible Fourier series (F).

[Figure 2 panels, each plotted against Age: A. CQF: 5%, 50%, 95% (Kernel);
B. CQF: 5%, 50%, 95% (Local Pol.); C. CQF: 5%, 50%, 95% (Splines);
D. CQF: 5%, 50%, 95% (Quartic); E. CQF: 5%, 50%, 95% (Fourier);
F. CQF: 5%, 50%, 95% (Flexible Fourier).]

Figure 2. Nonparametric estimates of the 5%, 50%, and 95%
Conditional Quantile Functions (CQF) of height given age and their increasing
rearrangements. Nonparametric estimates are obtained using kernel
regression (A), locally linear regression (B), cubic B-splines series (C), a
four degree polynomial (D), Fourier series (E), and flexible Fourier series
(F).

[Figure 3 panels: A. CQP (Kernel); B. CQP: Contour; C. CQP: Age Rearrangement;
D. CQP: Contour (R-Age); E. CQP: Quantile Rearrangement; F. CQP: Contour (R-Quantile);
G. CQP: Average Quantile/Age Rearrangement; H. CQP: Contour (RR-Quantile/Age).
Recovered axis label: quantile.]

Figure 3. Kernel estimates of the Conditional Quantile Process (CQP)
of height given age and their increasing rearrangements. Panels C and
E plot the one dimensional increasing rearrangement along the age and
quantile dimension respectively; panel G shows the average multivariate
rearrangement.

[Figure 4 panels: A. CQP (Local Pol.); B. CQP: Contour; C. CQP: Age Rearrangement;
D. CQP: Contour (R-Age); E. CQP: Quantile Rearrangement; F. CQP: Contour (R-Quantile);
G. CQP: Average Quantile/Age Rearrangement; H. CQP: Contour (RR-Quantile/Age).
Recovered axis label: quantile.]

Figure 4. Locally linear estimates of the Conditional Quantile Process
(CQP) of height given age and their increasing rearrangements. Panels C
and E plot the one dimensional increasing rearrangement along the age and
quantile dimension respectively; panel G shows the average multivariate
rearrangement.

[Figure 5 panels: A. CQP (Splines); B. CQP: Contour; C. CQP: Age Rearrangement;
D. CQP: Contour (R-Age); E. CQP: Quantile Rearrangement; F. CQP: Contour (R-Quantile);
G. CQP: Average Quantile/Age Rearrangement; H. CQP: Contour (RR-Quantile/Age).]

Figure 5. Cubic B-splines series estimates of the Conditional Quantile
Process (CQP) of height given age and their increasing rearrangements.
Panels C and E plot the one dimensional increasing rearrangement along
the age and quantile dimension respectively; panel G shows the average
multivariate rearrangement.

[Figure 6 panels: A. CQP (Quartic); B. CQP: Contour; C. CQP: Age Rearrangement;
D. CQP: Contour (R-Age); E. CQP: Quantile Rearrangement; F. CQP: Contour (R-Quantile);
G. CQP: Average Quantile/Age Rearrangement; H. CQP: Contour (RR-Quantile/Age).]

Figure 6. Quartic polynomial series estimates of the Conditional Quantile
Process (CQP) of height given age and their increasing rearrangements.
Panels C and E plot the one dimensional increasing rearrangement
along the age and quantile dimension respectively; panel G shows
the average multivariate rearrangement.

[Figure 7 panels: A. CQP (Fourier); B. CQP: Contour; C. CQP: Age Rearrangement;
D. CQP: Contour (R-Age); E. CQP: Quantile Rearrangement; F. CQP: Contour (R-Quantile);
G. CQP: Average Quantile/Age Rearrangement; H. CQP: Contour (RR-Quantile/Age).
Recovered axis label: quantile.]

Figure 7. Fourier series estimates of the Conditional Quantile Process
(CQP) of height given age and their increasing rearrangements. Panels C
and E plot the one dimensional increasing rearrangement along the age and
quantile dimension respectively; panel G shows the average multivariate
rearrangement.


[Figure 8 panels: A. CQP (Flexible Fourier); B. CQP: Contour; C. CQP: Age Rearrangement;
D. CQP: Contour (R-Age); E. CQP: Quantile Rearrangement; F. CQP: Contour (R-Quantile);
G. CQP: Average Quantile/Age Rearrangement; H. CQP: Contour (RR-Quantile/Age).]

Figure 8. Flexible Fourier form series estimates of the Conditional
Quantile Process (CQP) of height given age and their increasing rearrangements.
Panels C and E plot the one dimensional increasing rearrangement
along the age and quantile dimension respectively; panel G shows the
average multivariate rearrangement.